Load Libraries

# load in libraries ----
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.4.4     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.0
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(janitor)

Attaching package: 'janitor'

The following objects are masked from 'package:stats':

    chisq.test, fisher.test
library(dplyr)
library(ggplot2)
library(maps) 

Attaching package: 'maps'

The following object is masked from 'package:purrr':

    map

Load in Data

data_wd <- "/Users/oliviaholt/Documents/eds240/Holt-eds240-HW4/data"
# import data ----
fourteener_data <- read_csv(file.path(data_wd, "14er.csv"))
Rows: 58 Columns: 16
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (6): Mountain Peak, Mountain Range, fourteener, Standard Route, Difficu...
dbl (10): ID, Elevation_ft, Prominence_ft, Isolation_mi, Lat, Long, Distance...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Clean Data

# clean data ----
fourteener_clean <- fourteener_data %>% 
  clean_names()
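
The later plotting code refers to columns such as elevation_gain_ft and traffic_high, which only exist after clean_names() has run. As a rough illustration of what that step does, the sketch below applies it to a hypothetical one-row data frame. The column names are taken from the column specification above, but the values are made up.

```r
library(janitor)

# Hypothetical one-row stand-in for the raw CSV; values are invented
raw_example <- data.frame(
  "Mountain Peak"  = "Example Peak",
  "Elevation_ft"   = 14000,
  "Standard Route" = "Example Route",
  check.names = FALSE
)

# clean_names() converts column names to lower snake_case
cleaned_example <- clean_names(raw_example)
names(cleaned_example)
```

The cleaned names should come out as mountain_peak, elevation_ft, and standard_route.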

Which option do you plan to pursue: Option 1

Restate your question. Has this changed at all since HW #1:

What makes a route the most popular in terms of high traffic? Possible factors include prominence, isolation, and elevation.

Explain which variables from your data set(s) you will use to answer your question(s):

traffic (high), elevation gain, prominence, distance, latitude, longitude.

In HW #2, you should have created some exploratory data viz to better understand your data. You may already have some ideas of how you plan to formally visualize your data, but it’s incredibly helpful to look at visualizations by other creators for inspiration. Find at least two data visualizations that you could (potentially) borrow / adapt pieces from. Link to them or download and embed them into your .qmd file, and explain which elements you might borrow (e.g. the graphic form, legend design, layout, etc.).

I saw the bubble plot on Data to Viz and thought it would be a minimal but effective way to address my question. I also really liked the ternary plot from discussion, so I wanted to try one of those.

Alt text here

Alt text here

Hand-draw your anticipated three visualizations (option 1) or infographic (option 2). Take a photo of your drawing and embed it in your rendered .qmd file – note that these are not exploratory visualizations, but rather your plan for your final visualizations that you will eventually polish and submit with HW #4.

Alt text here

Alt text here

#BUBBLE PLOT

# Libraries
library(ggplot2)
library(dplyr)
library(hrbrthemes)
NOTE: Either Arial Narrow or Roboto Condensed fonts are required to use these themes.
      Please use hrbrthemes::import_roboto_condensed() to install Roboto Condensed and
      if Arial Narrow is not on your system, please see https://bit.ly/arialnarrow
library(viridis)
Loading required package: viridisLite

Attaching package: 'viridis'
The following object is masked from 'package:maps':

    unemp

# Bubble plot: elevation gain vs. traffic_high,
# sized by isolation and filled by difficulty
fourteener_clean %>%
  #arrange(desc(isolation_mi)) %>%
  ggplot(aes(x=elevation_gain_ft, y=traffic_high, size=isolation_mi, fill=difficulty)) +
    geom_point(alpha=0.5, shape=21, color="black") +
    scale_size(range = c(.1, 24), name="") +
    # guide = "none" replaces the deprecated guide = FALSE
    scale_fill_viridis(discrete=TRUE, guide="none", option="A") +
    theme_ipsum() +
    # axis labels follow the aesthetics: x is elevation gain, y is traffic
    xlab("Elevation Gain (ft)") +
    ylab("Traffic (high)") +
    theme(legend.position = "none")

#put labels on most popular peaks
# # load ggplot2
# library(ggplot2)
# library(hrbrthemes)
# 
# # Transparency
# ggplot(fourteener_clean, aes(x=traffic_high, y=elevation_gain_ft, alpha=difficulty)) + 
#     geom_point(size=6, color="#69b3a2") +
#     theme_ipsum()
library(ggtern)
Registered S3 methods overwritten by 'ggtern':
  method           from   
  grid.draw.ggplot ggplot2
  plot.ggplot      ggplot2
  print.ggplot     ggplot2
--
Remember to cite, run citation(package = 'ggtern') for further info.
--

Attaching package: 'ggtern'
The following objects are masked from 'package:ggplot2':

    aes, annotate, ggplot, ggplot_build, ggplot_gtable, ggplotGrob,
    ggsave, layer_data, theme_bw, theme_classic, theme_dark,
    theme_gray, theme_light, theme_linedraw, theme_minimal, theme_void
# Create bins for 'isolation_mi'
fourteener_clean$isolation_bin <- cut(
  fourteener_clean$isolation_mi,
  breaks = c(0, 20, 40, 60, 80, 100, 120),
  # labels describe the 20-mile breaks; values outside (0, 120] become NA
  labels = c("0-20", "20-40", "40-60", "60-80", "80-100", "100-120")
)

# Create a ternary plot
ternary_plot <- ggtern(data = fourteener_clean, aes(x = traffic_high,
                                                    y = elevation_gain_ft,
                                                    z = prominence_ft,
                                                    size = distance_mi,
                                                    fill = isolation_bin)) +
  geom_point(alpha = 0.5, shape = 21, color = "black") +
  scale_size(range = c(2, 8), name = "") +
  #scale_fill_viridis(discrete = TRUE, guide = FALSE, option = "A") +
  theme_ipsum() +
  ylab("Elevation Gain (ft)") +
  xlab("Traffic (high)") +
  theme(legend.position = "none")

# Print the ternary plot
print(ternary_plot)

# load United States state map data
MainStates <- map_data("state")
st_CO <- MainStates %>% 
  filter(region == "colorado")

# read the state population data
StatePopulation <- read.csv("https://raw.githubusercontent.com/ds4stats/r-tutorials/master/intro-maps/data/StatePopulation.csv", as.is = TRUE)

# plot Colorado with black borders and tan fill, with the fourteeners as points
ggplot() + 
  geom_polygon(data = st_CO, aes(x=long, y=lat, group=group),
                color="black", fill="tan" )+
  geom_point(data = fourteener_clean, aes(x = long, y = lat,
                                          #size = traffic_high,
                                          fill = difficulty), #changed from isolation_mi to difficulty
                                          #maybe bin the isolation data and then change fill the isolation
             alpha=0.5, shape=21, color="black", size = 4)+
  #scale_size(range = c(6, 8), name="Traffic (high)") +
    # guide = "none" replaces the deprecated guide = FALSE
    scale_fill_viridis(discrete=TRUE, guide="none", option="A") +
    theme_ipsum() +
    ylab("Latitude") +
    xlab("Longitude") +
    theme(legend.position = "none")

One challenge is having multiple legends and figuring out how to keep one legend while leaving another out. There are also so many ways to change the visuals of a plot that it's hard to know what will work. The axes on the ternary plot are also giving me trouble.
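
For the legend question, one possible approach is ggplot2's guides(), which can switch off the legend for a single aesthetic while leaving the others in place. The sketch below is only an illustration under assumptions: toy_peaks is a hypothetical stand-in for fourteener_clean with made-up values, and the choice to keep the fill legend and drop the size legend is arbitrary.

```r
library(ggplot2)

# Hypothetical toy data standing in for fourteener_clean; values are invented
toy_peaks <- data.frame(
  elevation_gain_ft = c(3000, 4500, 5200, 2800),
  traffic_high      = c(20000, 7000, 3000, 15000),
  isolation_mi      = c(5, 12, 40, 2),
  difficulty        = c("Class 1", "Class 2", "Class 3", "Class 1")
)

legend_demo <- ggplot(toy_peaks,
                      aes(x = elevation_gain_ft, y = traffic_high,
                          size = isolation_mi, fill = difficulty)) +
  geom_point(shape = 21, color = "black", alpha = 0.5) +
  guides(
    # drop the size legend only
    size = "none",
    # keep the fill legend and enlarge its keys so the colors are readable
    fill = guide_legend(override.aes = list(size = 4))
  ) +
  labs(fill = "Difficulty") +
  theme(legend.position = "bottom")

legend_demo
```

Note that theme(legend.position = "none") removes every legend at once, so it would need to be left out for this per-aesthetic control to have any visible effect.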

tidyverse, janitor, dplyr, ggplot2, maps, and ggtern. I think we have used all of these in class except for ggtern, which I used for the ternary plot.

Maybe on the color palette, or on how to bin the data, specifically for the size argument in aes().
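
On binning for the size aesthetic, one option that might be worth trying is ggplot2's scale_size_binned(), which groups a continuous variable into size classes without creating a new column. This is a sketch under assumptions, not a recommendation for the final figure: toy_peaks is hypothetical data, and the breaks at 5 and 10 miles are arbitrary placeholders.

```r
library(ggplot2)

# Hypothetical toy data; values are invented for illustration
toy_peaks <- data.frame(
  elevation_gain_ft = c(3000, 4500, 5200, 2800),
  traffic_high      = c(20000, 7000, 3000, 15000),
  distance_mi       = c(4.5, 9, 14, 7)
)

# Placeholder break points (miles); real breaks would depend on the data
size_breaks <- c(5, 10)

binned_size_demo <- ggplot(toy_peaks,
                           aes(x = elevation_gain_ft, y = traffic_high,
                               size = distance_mi)) +
  geom_point(shape = 21, fill = "grey60", color = "black", alpha = 0.5) +
  # continuous distance is mapped to a small number of size classes
  scale_size_binned(breaks = size_breaks, range = c(2, 8),
                    name = "Distance (mi)")

binned_size_demo
```

Compared with building bins through cut(), this keeps the underlying column numeric and lets the legend show the break points directly.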